Search Results

Documents authored by Arora, Sanjeev


Document
Invited Talk
The Quest for Mathematical Understanding of Deep Learning (Invited Talk)

Authors: Sanjeev Arora

Published in: LIPIcs, Volume 182, 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020)


Abstract
Deep learning has transformed Machine Learning and Artificial Intelligence in the past decade. It raises fundamental questions for mathematics and theory of computer science, since it relies upon solving large-scale nonconvex problems via gradient descent and its variants. This talk will be an introduction to mathematical questions raised by deep learning, and some partial understanding obtained in recent years.

Cite as

Sanjeev Arora. The Quest for Mathematical Understanding of Deep Learning (Invited Talk). In 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 182, p. 1:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{arora:LIPIcs.FSTTCS.2020.1,
  author =	{Arora, Sanjeev},
  title =	{{The Quest for Mathematical Understanding of Deep Learning}},
  booktitle =	{40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020)},
  pages =	{1:1--1:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-174-0},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{182},
  editor =	{Saxena, Nitin and Simon, Sunil},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2020.1},
  URN =		{urn:nbn:de:0030-drops-132427},
  doi =		{10.4230/LIPIcs.FSTTCS.2020.1},
  annote =	{Keywords: machine learning, artificial intelligence, deep learning, gradient descent, optimization}
}
Document
Invited Talk
Overcoming Intractability in Unsupervised Learning (Invited Talk)

Authors: Sanjeev Arora

Published in: LIPIcs, Volume 30, 32nd International Symposium on Theoretical Aspects of Computer Science (STACS 2015)


Abstract
Unsupervised learning - i.e., learning with unlabeled data - is increasingly important given today's data deluge. Most natural problems in this domain - e.g. for models such as mixture models, HMMs, graphical models, topic models and sparse coding/dictionary learning, deep learning - are NP-hard. Therefore researchers in practice use either heuristics or convex relaxations with no concrete approximation bounds. Several nonconvex heuristics work well in practice, which is also a mystery. The talk will describe a sequence of recent results whereby rigorous approaches leading to polynomial running time are possible for several problems in unsupervised learning. The proof of polynomial running time usually relies upon nondegeneracy assumptions on the data and the model parameters, and often also on stochastic properties of the data (average-case analysis). We describe results for topic models, sparse coding, and deep learning. Some of these new algorithms are very efficient and practical - e.g. for topic modeling.

Cite as

Sanjeev Arora. Overcoming Intractability in Unsupervised Learning (Invited Talk). In 32nd International Symposium on Theoretical Aspects of Computer Science (STACS 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 30, p. 1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)


Copy BibTex To Clipboard

@InProceedings{arora:LIPIcs.STACS.2015.1,
  author =	{Arora, Sanjeev},
  title =	{{Overcoming Intractability in Unsupervised Learning}},
  booktitle =	{32nd International Symposium on Theoretical Aspects of Computer Science (STACS 2015)},
  pages =	{1--1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-78-1},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{30},
  editor =	{Mayr, Ernst W. and Ollinger, Nicolas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2015.1},
  URN =		{urn:nbn:de:0030-drops-49581},
  doi =		{10.4230/LIPIcs.STACS.2015.1},
  annote =	{Keywords: machine learning, unsupervised learning, intractability, NP-hardness}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail